Abstract

The main focus of this paper is a pair of new approximation algorithms for certain integer programs.
First, for covering integer programs $\{\min c \cdot x : Ax \ge b, 0 \le x \le d\}$ where $A$ has at most $k$ nonzeroes per
row, we give a $k$-approximation algorithm. (We assume $A, b, c, d$ are nonnegative.) For any $k \ge 2$ and
$\epsilon > 0$, if P $\ne$ NP this ratio cannot be improved to $k - 1 - \epsilon$, and under the unique games conjecture
this ratio cannot be improved to $k - \epsilon$. One key idea is to replace individual constraints by others that
have better rounding properties but the same nonnegative integral solutions; another critical ingredient
is knapsack-cover inequalities. Second, for packing integer programs $\{\max c \cdot x : Ax \le b, 0 \le x \le d\}$ where
$A$ has at most $k$ nonzeroes per column, we give a $(2k^2 + 2)$-approximation algorithm. Our approach
builds on the iterated LP relaxation framework. In addition, we obtain improved approximations for the
second problem when $k = 2$, and for both problems when every $A_{ij}$ is small compared to $b_i$. Finally,
we demonstrate a 17/16-inapproximability for covering integer programs with at most two nonzeroes per
column.

1 Introduction 

We investigate the following problem: what is the best possible approximation ratio for integer programs 
where the constraint matrix is sparse? To put this in context we recall a famous result of Lenstra [29]:
integer programs with a constant number of variables or a constant number of constraints can be solved in
polynomial time. Our investigations analogously ask what is possible if each constraint involves at most k 
variables, or if each variable appears in at most k constraints. 

Rather than consider all integer programs, we consider only packing and covering problems. Such programs
have only positive quantities in their parameters. One reason for this is that every integer program
can be rewritten (possibly with additional variables) in such a way that each constraint contains at most 
3 variables and each variable appears in at most 3 constraints, if both positive and negative coefficients 
are allowed. Aside from this, packing programs and covering programs capture a substantial number of 
combinatorial optimization problems and are interesting in their own right. 

A covering (resp. packing) integer program, shorthanded as CIP (resp. PIP) henceforth, is an integer
program of the form $\{\min c \cdot x : Ax \ge b, 0 \le x \le d\}$ (resp. $\{\max c \cdot x : Ax \le b, 0 \le x \le d\}$) with $A, b, c, d$
nonnegative and rational. Note that CIPs are sometimes called multiset multicover when $A$ and $b$ are integral.
We call constraints $x \le d$ multiplicity constraints (also known as capacity constraints). We allow for entries
of $d$ to be infinite, and without loss of generality, all finite entries of $d$ are integral. An integer program with
constraint matrix $A$ is $k$-row-sparse, or $k$-RS, if each row of $A$ has at most $k$ nonzero entries; we define
$k$-column-sparse ($k$-CS) similarly. As a rule of thumb we ignore the case $k = 1$, since such problems trivially admit
fully polynomial-time approximation schemes (FPTAS's) or poly-time algorithms. The symbol $\mathbf{0}$ denotes
the all-zero vector, and similarly $\mathbf{1}$ denotes the all-ones vector. For covering problems an $\alpha$-approximation
algorithm returns a feasible solution with objective value at most $\alpha$ times optimal; for packing, the algorithm
returns a feasible solution whose objective value is at least $1/\alpha$ times optimal. We use $n$ to denote the number
of variables and $m$ the number of constraints (i.e. the number of columns and rows of $A$, respectively).
Throughout the paper, $A$ will be used as a matrix. We let $A_j$ denote the $j$th column of $A$, and let $a_i$ denote
the $i$th row of $A$.
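The sparsity parameters just defined can be computed directly from the constraint matrix. The following is a minimal sketch, assuming the matrix is given as a list of rows; the function names `row_sparsity` and `column_sparsity` are illustrative and do not come from the paper.

```python
def row_sparsity(A):
    """Largest number of nonzero entries in any row of A (a list of rows)."""
    return max((sum(1 for v in row if v != 0) for row in A), default=0)


def column_sparsity(A):
    """Largest number of nonzero entries in any column of A."""
    if not A:
        return 0
    n = len(A[0])
    return max((sum(1 for row in A if row[j] != 0) for j in range(n)), default=0)


# Vertex cover on a path with three vertices: one covering row per edge.
A = [[1, 1, 0],
     [0, 1, 1]]
print(row_sparsity(A), column_sparsity(A))  # prints "2 2"
```

An instance is $k$-RS exactly when `row_sparsity(A) <= k`, and $k$-CS exactly when `column_sparsity(A) <= k`.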
1.1 k-Row-Sparse Covering IPs

The special case of 2-RS CIP where $A, b, c, d$ are 0-1 is the same as Min Vertex Cover, which is APX-hard.
More generally, 0-1 $k$-RS CIP is the same as $k$-Bounded Hypergraph Min Vertex Cover (a.k.a. Set Cover
with maximum frequency $k$), which is not approximable to $k - 1 - \epsilon$ for any fixed $\epsilon > 0$ unless P=NP [ ]
($k - \epsilon$ under the unique games conjecture [22]). This special case is known to admit a matching positive
result: set cover with maximum frequency $k$ can be $k$-approximated by direct rounding of the naive LP [15]
or local ratio/primal-dual methods [2].

The following results are known for other special cases of $k$-RS CIP with multiplicity constraints:
Hochbaum [12] gave a $k$-approximation in the special case that $A$ is 0-1; Hochbaum et al. [ ] and
Bar-Yehuda & Rawitz [3] gave pseudopolynomial 2-approximation algorithms for the case that $k = 2$ and $d$ is
finite. For the special case $d = \mathbf{1}$, Carr et al. [5, §2.6] gave a $k$-approximation, and Fujito & Yabuta [9] gave
a primal-dual $k$-approximation. Moreover, [5, 9] claim a $k$-approximation for general $d$; however, the papers
do not give a proof and we do not see a straightforward method of extending their techniques to the general
$d$ case. Our first main result, given in Section 2, is a simple proof of the same claim.

Theorem 1. There is a polynomial time $k$-approximation algorithm for $k$-RS CIPs with multiplicity
constraints.

Our approach is to first consider the special case that there are no multiplicity constraints (i.e. $d_j = +\infty$
for all $j$); we then extend to the case of finite $d$ via knapsack-cover inequalities, using linear programming
(LP) techniques from Carr et al. [5]. A $(k+1)$-approximation algorithm is relatively easy to obtain using LP
rounding; in order to get the tighter ratio $k$, we replace constraints by other "$\mathbb{Z}_+$-equivalent" constraints (see
Definition 8) with better rounding properties. The algorithm requires a polynomial-time linear programming
subroutine.

Independent simultaneous work of Koufogiannakis & Young [28, 26, 27] also gives a full and correct 
proof of Theorem 1. Their approach works for a broad generalization of $k$-RS CIPs and runs in strongly
polynomial time. Our approach has the generic advantage of giving new ideas that can be used in conjunction 
with other LP-based methods, and the specific advantage of giving integrality gap bounds (see Section 2.2). 

1.2 k-Column-Sparse Packing IPs

Before 2009, no constant-factor approximation was known for $k$-CS PIPs, except in special cases. If every
entry of $b$ is $\Omega(\log m)$ then randomized rounding provides a constant-factor approximation. Demand matching
is the special case of 2-CS PIP where (i) in each column of $A$ all nonzero values in that column are equal
to one another and (ii) no two columns have their nonzeroes in the same two rows. Shepherd & Vetta [33]
showed demand matching is APX-hard but admits a $(\frac{11}{2} - \sqrt{5})$-approximation algorithm when $d = \mathbf{1}$; their
approach also gives a $\frac{7}{2}$-approximation for 2-CS PIP instances satisfying (i). Results of Chekuri et al. [ ]
yield a $11.542k$-approximation algorithm for $k$-CS PIP instances satisfying (i) and such that the maximum
entry of $A$ is less than the minimum entry of $b$.

The special case of $k$-CS PIP where $A, b$ are 0-1 is the same as max-weight $k$-set packing, hypergraph
matching with edges of size $\le k$, and strong independent sets in hypergraphs with degree at most $k$. The
best approximation ratio known for this problem is $(k+1)/2 + \epsilon$ [4] for general weights, and $k/2 + \epsilon$ when
$c = \mathbf{1}$ [ ]. The best lower bound is due to Hazan et al. [14], who showed $\Omega(k/\ln k)$-inapproximability unless
P=NP, even for $c = \mathbf{1}$.

Our second main result, given in Section 3, is the following result. 

Theorem 2. There is a polynomial time $(2k^2 + 2)$-approximation algorithm for $k$-CS PIPs with multiplicity
constraints.

We use the iterated LP relaxation [ 14] technique to find an integral solution whose objective value is 
larger than the optimum, but violates some constraints. However the violation can be bounded. Then we 
use a colouring argument to decompose the violating solution into $O(k^2)$ feasible solutions, giving us the
$O(k^2)$-factor algorithm.
The original arXiv eprint and conference version [31] of this work gave a $O(k^2 2^k)$-approximation for $k$-CS
PIP using iterated relaxation plus a randomized decomposition approach; that was the first approximation
algorithm for this problem with ratio that depends only on $k$. Subsequently in April 2009, C. Chekuri,
A. Ene and N. Korula (personal communication) obtained an $O(k 2^k)$ algorithm using randomized rounding,
and an $O(k^2)$-approximation in May 2009. The latter method, which appears in this version, was independently
re-derived by the authors. Finally, Bansal et al. [1], in August 2009, gave a simple and elegant
$O(k)$-approximation algorithm based on randomized rounding with a careful alteration argument.

1.3 k-Column-Sparse Covering IPs

Srinivasan [35, 36] showed that $k$-CS CIPs admit a $O(\log k)$-approximation. Kolliopoulos and Young [24]
extended this result to handle multiplicity constraints. There is a matching hardness result: it is NP-hard
to approximate $k$-Set Cover, which is the special case where $A, b, c$ are 0-1, better than $\ln k - O(\ln \ln k)$
for any $k \ge 3$ [ ]. Hence for $k$-CS CIP the best possible approximation ratio is $O(\log k)$. A $(k + \epsilon)$-approximation
algorithm can be obtained by separately applying an approximation scheme to the knapsack
problem corresponding to each constraint. Although 0-1 2-CS CIP is Edge Cover which lies in P, general
2-CS CIP is NP-hard due to Hochbaum [16], who also gave a bicriteria approximation algorithm. Here, we
give a stronger inapproximability result.

Theorem 3. For every $\epsilon > 0$ it is NP-hard to approximate 2-CS CIPs of the form $\{\min c \cdot x \mid Ax \ge b, x \text{ is 0-1}\}$
and $\{\min c \cdot x \mid Ax \ge b, x \ge 0, x \text{ integral}\}$ within ratio $17/16 - \epsilon$, even if the nonzeroes of every column of $A$
are equal and $A$ is of the block form $\begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$ where each $A_i$ is 1-CS.

Our proof modifies a construction of [6]; we also note a construction of [33] can be modified to prove 
APX-hardness for the problem. 

1.4 Other Work 

The special case of 2-RS PIP where $A, b, c$ are 0-1 is the same as Max Independent Set, which is not
approximable within $n/2^{\log^{3/4+\epsilon} n}$ unless NP $\subseteq$ BPTIME($2^{\log^{O(1)} n}$) [21]. On the other hand, $n$-approximation
of any packing problem is easy to accomplish by looking at the best singleton-support solution. A slightly
better $n/t$-approximation, for any fixed $t$, can be accomplished by exhaustively guessing the $t$ most profitable
variables in the optimal solution, and then solving the resulting $t$-dimensional integer program to optimality
via Lenstra's result [29].
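The singleton-support argument can be made concrete as follows. This is a minimal sketch, assuming integer or rational inputs and an instance in which every profitable variable is bounded; the function name `best_singleton` is illustrative. Any optimal solution has at most $n$ nonzero coordinates, and each coordinate on its own is feasible for a packing program, so the best single variable recovers at least a $1/n$ fraction of the optimum.

```python
import math
from fractions import Fraction


def best_singleton(A, b, c, d):
    """Best packing solution supported on one variable.

    Returns (value, j, x_j). Assumes A, b, c, d are nonnegative and that
    each variable with positive profit is bounded by some constraint or d_j.
    """
    best = (0, None, 0)
    for j in range(len(c)):
        cap = d[j]  # may be math.inf
        for i in range(len(A)):
            if A[i][j] > 0:
                cap = min(cap, math.floor(Fraction(b[i]) / Fraction(A[i][j])))
        if cap == math.inf:
            if c[j] > 0:
                raise ValueError("unbounded instance")
            continue
        value = c[j] * cap
        if value > best[0]:
            best = (value, j, cap)
    return best


A = [[2, 1],
     [1, 3]]
b = [4, 6]
c = [3, 2]
d = [math.inf, math.inf]
print(best_singleton(A, b, c, d))  # prints "(6, 0, 2)"
```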

A closely related problem is $k$-Dimensional Knapsack, which consists of PIPs or CIPs with at most $k$ constraints
(in addition to nonnegativity and multiplicity constraints). For fixed $k$, such problems admit a PTAS and
pseudo-polynomial time algorithms, but are weakly NP-hard; see [20] and [32, Ch. 9] for detailed references.

When $d = \mathbf{1}$, a natural way to generalize CIP/PIPs is to allow the objective function to be submodular
(rather than linear). For minimizing a submodular objective subject to $k$-row sparse covering constraints,
the framework of Koufogiannakis & Young [28, 26, 27] gives a $k$-approximation; if also $A, b$ are 0-1 (i.e.
submodular set cover) Iwata and Nagano [19] give a $k$-approximation for all $k$ and Goel et al. [11] give a
2-approximation for $k = 2$. For maximizing a monotone submodular function subject to $k$-column sparse
packing constraints, the algorithm of Bansal et al. [1] gives a $O(k)$-approximation algorithm.

1.5 Summary 

We summarize our results and preceding ones in Table 1; recall also the follow-up $O(k)$ approximation for
$k$-CS PIPs [1]. Note that in all four cases, the strongest known lower bounds are obtained even in the special
case that $A, b, c, d$ are 0-1.
            k-Column-Sparse                              k-Row-Sparse
            lower bound              upper bound         lower bound       upper bound
Packing     $\Omega(k/\ln k)$        $\mathbf{2k^2+2}$   $n^{1-o(1)}$      $\epsilon n$
Covering    $\ln k - O(\ln\ln k)$    $O(\ln k)$          $k - \epsilon$    $\mathbf{k}$

Table 1: The landscape of approximability of sparse integer programs. Our main results are in boldface.



2 Approximation for k-Row-Sparse CIPs

By scaling rows suitably and clipping coefficients that are too high (i.e. setting $A_{ij} = \min\{1, A_{ij}\}$), we may
make the following assumption without loss of generality.

Definition 4. A $k$-RS CIP is an integer program $\{\min c \cdot x : Ax \ge \mathbf{1}, 0 \le x \le d, x \in \mathbb{Z}\}$ where $A$ is $k$-RS
and all entries of $A$ are at most 1.

To begin with, we focus on the case $d_j = +\infty$ for all $j$, which we call the unbounded $k$-RS CIP, since it
illustrates the essence of our new technique. Let $x$ be an $n$-dimensional vector of variables and let $\alpha$ be a vector
of real coefficients. Throughout, we assume coefficients are nonnegative. When we apply $\lfloor \cdot \rfloor$ to vectors we
mean the component-wise floor. That is, the $j$th coordinate of $\lfloor \alpha \rfloor$ is $\lfloor \alpha_j \rfloor$.

Definition 5. A constraint $\alpha \cdot x \ge 1$ is $\rho$-roundable for some $\rho > 1$ if for all nonnegative real $x$, $(\alpha \cdot x \ge 1)$
implies $(\alpha \cdot \lfloor \rho x \rfloor \ge 1)$.

Note that $\rho$-roundability implies $\rho'$-roundability for $\rho' > \rho$. The relevance of this property is explained
by the following proposition. 

Proposition 6. If every constraint in an unbounded covering integer program is $\rho$-roundable, then there is
a $\rho$-approximation algorithm for the program.

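Proof. Let $x^*$ be an optimal solution to the program's linear relaxation. Then $c \cdot x^*$ is a lower bound on the
cost of any optimal solution. By $\rho$-roundability, $\lfloor \rho x^* \rfloor$ satisfies every constraint, and since $c \ge 0$ its cost is at
most $\rho \, c \cdot x^*$. Thus, $\lfloor \rho x^* \rfloor$ is a feasible integral solution with cost at most $\rho$ times optimal. □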

We make another simple observation. 

Proposition 7. The constraint $\alpha \cdot x \ge 1$ is $(1 + \sum_i \alpha_i)$-roundable.

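Proof. Let $\rho = 1 + \sum_i \alpha_i$. Since $\lfloor t \rfloor > t - 1$ for any $t$, if $\alpha \cdot x \ge 1$ for a nonnegative $x$, then

$$\alpha \cdot \lfloor \rho x \rfloor \ge \sum_i \alpha_i (\rho x_i - 1) = \rho \sum_i \alpha_i x_i - \sum_i \alpha_i \ge \rho - (\rho - 1) = 1,$$

as needed. □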
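Propositions 6 and 7 together describe a scale-then-floor rounding step. The sketch below illustrates that step on a single constraint using exact rational arithmetic; the helper names (`rounding_factor`, `scale_and_floor`, `satisfies`) are illustrative, and the fractional point is supplied by hand rather than by an LP solver.

```python
import math
from fractions import Fraction


def rounding_factor(alpha):
    """rho = 1 + (sum of coefficients), the factor from Proposition 7."""
    return 1 + sum(alpha)


def scale_and_floor(x, rho):
    """Component-wise floor of rho * x."""
    return [math.floor(rho * xi) for xi in x]


def satisfies(alpha, x):
    """Does x satisfy the covering constraint alpha . x >= 1?"""
    return sum(a * xi for a, xi in zip(alpha, x)) >= 1


alpha = [Fraction(1, 2), Fraction(1, 3)]
x_frac = [Fraction(1), Fraction(3, 2)]    # alpha . x_frac = 1, so it is feasible
rho = rounding_factor(alpha)              # 11/6
x_int = scale_and_floor(x_frac, rho)      # [1, 2]
print(x_int, satisfies(alpha, x_int))     # prints "[1, 2] True"
```

Exact fractions are used so that the floor is not affected by floating-point error near integer values.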

Now consider an unbounded $k$-RS CIP. Since each constraint has at most $k$ coefficients, each at most
1, it follows from Proposition 7 that every constraint in these programs is $(k+1)$-roundable, and so such
programs admit a $(k+1)$-approximation algorithm by Proposition 6. It is also clear that we can tighten the
approximation ratio to $k$ for programs where the sum of the coefficients in every constraint (row) is at most
$k - 1$. We now show that rows with sum in $(k-1, k]$ can be replaced by other rows which are $k$-roundable.

Definition 8. Two constraints $\alpha \cdot x \ge 1$ and $\alpha' \cdot x \ge 1$ are $\mathbb{Z}_+$-equivalent if for all nonnegative integral $x$,
$(\alpha \cdot x \ge 1) \Leftrightarrow (\alpha' \cdot x \ge 1)$.

In other words, replacing a constraint by a $\mathbb{Z}_+$-equivalent constraint doesn't affect the value of the CIP.

Proposition 9. Every constraint $\alpha \cdot x \ge 1$ with at most $k$ nonzero coefficients is $\mathbb{Z}_+$-equivalent to a
$k$-roundable constraint.






Before proving Proposition 9, let us illustrate its use. 



Theorem 10. There is a polynomial time $k$-approximation algorithm for unbounded $k$-RS CIPs.

Proof. Using Proposition 9 we replace each constraint with a $\mathbb{Z}_+$-equivalent $k$-roundable one. The resulting
IP has the same set of feasible solutions and the same objective function. Therefore, Proposition 6 yields a
$k$-approximately optimal solution. □

With the framework set up, we begin the technical part: a lemma, then the proof of Proposition 9. 

Lemma 11. For any positive integers $k$ and $v$, the constraint $\sum_{i=1}^{k-1} x_i + \frac{1}{v} x_k \ge 1$ is $k$-roundable.

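Proof. Let $\alpha \cdot x \ge 1$ denote the constraint, i.e. $\alpha_k = \frac{1}{v}$, $\alpha_i = 1$ for $1 \le i < k$. If $x$ satisfies the constraint,
then the maximum of $x_1, x_2, \ldots, x_{k-1}$ and $\frac{1}{v} x_k$ must be at least $1/k$. If $x_i \ge 1/k$ for some $i \ne k$ then
$\lfloor k x_i \rfloor \ge 1$ and so $\alpha \cdot \lfloor kx \rfloor \ge 1$ as needed. Otherwise $x_k$ must be at least $v/k$ and so $\lfloor k x_k \rfloor \ge v$, which implies
$\alpha \cdot \lfloor kx \rfloor \ge 1$ as needed. □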

Proof of Proposition 9. If the sum of coefficients in the constraint is $k - 1$ or less, we are done by Proposition
7, hence we assume the sum is strictly greater than $k - 1$. Without loss of generality (by renaming) such a
constraint is of the form

$$\sum_{i=1}^{k} x_i \alpha_i \ge 1 \qquad (1)$$

where $0 < \alpha \le 1$, $k - 1 < \sum_i \alpha_i \le k$ and the $\alpha_i$'s are nonincreasing in $i$.

Define the support of $x$ to be $\mathrm{supp}(x) := \{i \mid x_i > 0\}$. We claim that for any two distinct $j, \ell$, $\alpha_j + \alpha_\ell > 1$.
Otherwise, $\sum_i \alpha_i \le (k - 2) + 1 = k - 1$. Thus, for any nonnegative integral $x$ with $|\mathrm{supp}(x)| \ge 2$, we have
$\alpha \cdot x \ge 1$. To express the set of all feasible integral solutions, let $t$ be the maximum $i$ for which $\alpha_i = 1$ (or
$t = 0$ if no such $i$ exists), let $e_i$ denote the $i$th unit basis vector, and let $v = \lceil 1/\alpha_k \rceil$. Then it is not hard to
see that the nonnegative integral solution set to constraint (1) is the disjoint union

$$\{x \mid x \ge 0, |\mathrm{supp}(x)| \ge 2\} \uplus \{z e_i \mid 1 \le i \le t, z \ge 1, z \in \mathbb{Z}\} \uplus \{z e_i \mid t < i < k, z \ge 2, z \in \mathbb{Z}\} \uplus \{z e_k \mid z \ge v, z \in \mathbb{Z}\}. \qquad (2)$$

The special case $t = k$ (i.e. $\alpha_1 = \alpha_2 = \cdots = \alpha_k = 1$) is already $k$-roundable by Lemma 11, so assume $t < k$.
Consider the constraint

$$\sum_{i=1}^{t} x_i + \sum_{i=t+1}^{k-1} \frac{v-1}{v} x_i + \frac{1}{v} x_k \ge 1. \qquad (3)$$

Every integral $x \ge 0$ with $|\mathrm{supp}(x)| \ge 2$ satisfies constraint (3). By also considering the cases $|\mathrm{supp}(x)| \in
\{0, 1\}$, it is easy to check that constraint (3) has precisely Equation (2) as its set of feasible solutions, i.e.
constraint (3) is $\mathbb{Z}_+$-equivalent to $\alpha \cdot x \ge 1$. If $t < k - 1$, the sum of the coefficients of constraint (3) is $k - 1$
or less, so it is $k$-roundable by Proposition 7. If $t = k - 1$, constraint (3) is $k$-roundable by Lemma 11. Thus
in either case we have what we wanted. □
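The replacement used in this proof is easy to carry out mechanically. The following is a minimal sketch, assuming the coefficients are given as exact fractions in $(0, 1]$ sorted in nonincreasing order; the names `equivalent_roundable` and `feasible` are illustrative. The brute-force loop only checks $\mathbb{Z}_+$-equivalence on a small box of integer points and is not a proof.

```python
import math
from fractions import Fraction
from itertools import product


def equivalent_roundable(alpha):
    """Coefficients of a Z+-equivalent k-roundable constraint.

    alpha: nonincreasing Fractions in (0, 1]; k = len(alpha).
    """
    k = len(alpha)
    if sum(alpha) <= k - 1:
        return list(alpha)                 # k-roundable by Proposition 7
    t = sum(1 for a in alpha if a == 1)    # the ones come first since alpha is sorted
    if t == k:
        return list(alpha)                 # Lemma 11 with v = 1
    v = math.ceil(1 / alpha[-1])
    # Constraint (3): ones, then (v-1)/v for the middle indices, then 1/v.
    return [Fraction(1)] * t + [Fraction(v - 1, v)] * (k - 1 - t) + [Fraction(1, v)]


def feasible(alpha, x):
    return sum(a * xi for a, xi in zip(alpha, x)) >= 1


alpha = [Fraction(1), Fraction(3, 4), Fraction(2, 5)]
new_alpha = equivalent_roundable(alpha)    # coefficients 1, 2/3, 1/3 (t = 1, v = 3)

# Both constraints accept exactly the same integer points in {0,...,4}^3.
same = all(feasible(alpha, x) == feasible(new_alpha, x)
           for x in product(range(5), repeat=3))
print(new_alpha, same)
```

Here the original coefficients sum to $2.15 > k - 1 = 2$, while the replacement coefficients sum to exactly 2, so Proposition 7 gives 3-roundability.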



2.1 Multiplicity Constraints 

We next obtain approximation guarantee $k$ even with multiplicity constraints $x \le d$. For this we use
knapsack-cover inequalities. These inequalities represent residual covering problems when a set of variables is taken
at maximum multiplicity. Wolsey [38] studied inequalities like this for 0-1 problems to get a primal-dual
approximation algorithm for submodular set cover. The LP we use is similar to what appears in Carr et al.
[5] and Kolliopoulos & Young [24], but we first replace each row with a $k$-roundable one.

Specifically, given a CIP $\{\min c \cdot x \mid Ax \ge \mathbf{1}, 0 \le x \le d, x \in \mathbb{Z}\}$ with $A, d$ nonnegative, we now define
the knapsack cover LP. Note that we allow $d$ to contain some entries equal to $+\infty$; if $d_j = +\infty$ and some $i$
has $A_{ij} = 0$ our convention is that $A_{ij} d_j = 0$. Recall, $a_i$ is the $i$th row of $A$ and $\mathrm{supp}(a_i)$ denotes the set
$\{j : A_{ij} > 0\}$. For a subset $F$ of $\mathrm{supp}(a_i)$ such that $\sum_{j \in F} A_{ij} d_j < 1$, define $A^{(F)}_{ij} = \min\{A_{ij}, 1 - \sum_{j \in F} A_{ij} d_j\}$.
Following [5, 24] we define the knapsack cover LP for our problem to be

$$\text{KC-LP} = \Big\{ \min c \cdot x : 0 \le x \le d;\ \forall i, \forall F \subseteq \mathrm{supp}(a_i) \text{ s.t. } \sum_{j \in F} A_{ij} d_j < 1 : \sum_{j \notin F} A^{(F)}_{ij} x_j \ge 1 - \sum_{j \in F} A_{ij} d_j \Big\}.$$



It is not too hard to check that any integral solution to the CIP satisfies the constraints of KC-LP, and
thus the optimal value of the latter is a lower bound on the optimal value of the CIP.

Theorem 1. There is a polynomial time $k$-approximation algorithm for $k$-RS CIPs.

Proof. Using Proposition 9, we assume all rows of $A$ are $k$-roundable. Let $x^*$ be the optimal solution to
KC-LP. Define $\hat{x} = \min\{d, \lfloor k x^* \rfloor\}$, where min denotes the component-wise minimum. We claim that $\hat{x}$ is a
feasible solution to the CIP, which will complete the proof since the objective value of $\hat{x}$ is at most $k$ times
the objective value of KC-LP. In other words, we want to show for each row $i$ that $a_i \cdot \hat{x} \ge 1$.

Fix any row $i$ and define $F = \{j \in \mathrm{supp}(a_i) \mid x^*_j \ge d_j/k\}$, i.e. $F$ is those variables in the constraint that
were rounded to their maximum multiplicity. If $F = \emptyset$ then, by the $k$-roundability of $a_i \cdot x \ge 1$, we have
that $a_i \cdot \hat{x} = a_i \cdot \lfloor k x^* \rfloor \ge 1$ as needed. So assume $F \ne \emptyset$. Note that for $j \in F$, we have $\hat{x}_j = d_j$ and for
$j \notin F$, we have $\hat{x}_j = \lfloor k x^*_j \rfloor$.

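If $\sum_{j \in F} A_{ij} d_j \ge 1$ then the constraint $a_i \cdot \hat{x} \ge 1$ is satisfied; consider otherwise. Since $\lfloor k x^*_j \rfloor > k x^*_j - 1$
for $j \notin F$, since $x^*$ satisfies the knapsack cover constraint for $i$ and $F$, and since $A^{(F)}_{ij} \le 1 - \sum_{j \in F} A_{ij} d_j$ for
each $j$, we have

$$\sum_{j \notin F} A^{(F)}_{ij} \hat{x}_j = \sum_{j \notin F} A^{(F)}_{ij} \lfloor k x^*_j \rfloor \ge k \sum_{j \notin F} A^{(F)}_{ij} x^*_j - \sum_{j \notin F} A^{(F)}_{ij}$$
$$\ge k \Big(1 - \sum_{j \in F} A_{ij} d_j\Big) - \big|\{j : j \in \mathrm{supp}(a_i) \setminus F\}\big| \Big(1 - \sum_{j \in F} A_{ij} d_j\Big).$$

Since $F \ne \emptyset$ and $|\mathrm{supp}(a_i)| \le k$, this gives $\sum_{j \notin F} A^{(F)}_{ij} \hat{x}_j \ge 1 - \sum_{j \in F} A_{ij} d_j$. Rearranging, and using the
fact ($\forall j : A_{ij} \ge A^{(F)}_{ij}$), we deduce $a_i \cdot \hat{x} \ge 1$, as needed.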

For fixed $k$, we may solve KC-LP explicitly, since it has polynomially many constraints. For general $k$, no
method is currently known to solve KC-LP in polynomial time. However, one can use the ellipsoid method
to find a solution $x^*$ whose objective is no larger than the optimum of KC-LP, and which satisfies the knapsack-cover
constraints corresponding to the set $F = \{j : x^*_j \ge d_j/k\}$. Note that this is all we need for the above analysis.
Details of how the ellipsoid method finds such a solution are given in [5, 24]. □
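To make the knapsack-cover inequalities concrete, the sketch below builds the inequality for one row $a_i \cdot x \ge 1$ and one set $F$, and tests a fractional point against it. It is an illustration only: the names `kc_inequality` and `satisfies_kc` are hypothetical, inputs are exact fractions, $F$ is assumed to contain only variables with finite $d_j$, and the row is a scaled instance chosen by hand.

```python
import math
from fractions import Fraction


def kc_inequality(row, d, F):
    """Knapsack-cover inequality for the row `row . x >= 1` and index set F.

    Returns (coeffs, rhs) with coeffs[j] = min(A_ij, rhs) for j in supp(row) \\ F
    and rhs = 1 - sum_{j in F} A_ij d_j, or None if that sum is at least 1.
    """
    load = sum(row[j] * d[j] for j in F)
    if load >= 1:
        return None                       # the variables in F already cover the row
    rhs = 1 - load
    coeffs = {j: min(a, rhs) for j, a in enumerate(row) if a > 0 and j not in F}
    return coeffs, rhs


def satisfies_kc(row, d, F, x):
    ineq = kc_inequality(row, d, F)
    if ineq is None:
        return True
    coeffs, rhs = ineq
    return sum(a * x[j] for j, a in coeffs.items()) >= rhs


row = [Fraction(3, 4), Fraction(3, 4)]    # the row [3 3] x >= 4 after scaling
d = [1, math.inf]
x_naive = [Fraction(1), Fraction(1, 3)]   # satisfies row . x >= 1 exactly

print(kc_inequality(row, d, {0}))         # coefficient 1/4 on x_1, right-hand side 1/4
print(satisfies_kc(row, d, {0}, x_naive)) # prints "False"
```

With $F = \{0\}$ the inequality reads $\frac{1}{4} x_1 \ge \frac{1}{4}$ (0-indexed), which forces the second variable up to 1 and cuts off the fractional point that the naive relaxation accepts.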



2.2 Integrality Gap Bounds 

In discussing integrality gaps for $k$-RS CIP problems, we say that the naive LP relaxation of $\{\min c \cdot x \mid
Ax \ge b, 0 \le x \le d, x \in \mathbb{Z}\}$ is the LP obtained by removing the restriction of integrality. Earlier, we made
the assumption that $A_{ij} \le b_i$ for all $i, j$; let us call this the clipping assumption. The clipping assumption
is without loss of generality for the purposes of approximation guarantees; however, it affects the integrality
gap of the naive LP for unbounded $k$-RS CIP, as we now illustrate. Without the clipping assumption, the
integrality gap of $k$-RS CIP problems can be unbounded as a function of $k$; indeed for any integer $M \ge 1$
the well-known covering problem $\{\min x_1 \mid [M] x_1 \ge 1, 0 \le x_1\}$ has integrality gap $M$. In instances with
the clipping assumption and without multiplicity constraints, the previous methods in this section establish
that the integrality gap of the naive LP is at most $k + 1$.
Even under the clipping assumption, it is well-known that $k$-RS CIPs with multiplicity constraints can
have large integrality gaps, e.g. $\{\min x_2 \mid [M \; M] x \ge M + 1, 0 \le x, x_1 \le 1\}$ has integrality gap $M$.
For bounded instances, the knapsack-cover inequalities represent a natural generalization of the clipping
assumption, namely, we perform a sort of clipping even considering that any subset of the variables are
chosen to their maximum extent.

We have seen that KC-LP has integrality gap at most $k + 1$ on $k$-RS CIP instances. Our methods also
show that if we replace each row with a $k$-roundable one (Proposition 9), then the corresponding KC-LP
has integrality gap at most $k$. We are actually unaware of any $k$-RS CIP instance with $k > 1$ where the
integrality gap of KC-LP (without applying Proposition 9) is greater than $k$; resolving whether such an
instance exists would be interesting. Some special cases are understood, e.g. Koufogiannakis and Young [27]
give a primal-dual $k$-approximation for $k$-CS PIP in the case $A$ is 0-1, also known as hypergraph $b$-matching.

3 Column-Sparse Packing Integer Programs 

In this section we give an approximation algorithm for $k$-column-sparse packing integer programs with
approximation ratio $2k^2 + 2$. We also give better results for $k = 2$, and for programs with high width (we defer the
definition to a later subsection). The results hold even in the presence of multiplicity constraints $x \le d$.
Broadly speaking, our approach is rooted in the demand matching algorithm of Shepherd & Vetta [33]; their
path-augmenting algorithm can be viewed as a restricted form of iterated relaxation, which is the main tool
in our new approach. Iterated relaxation yields a solution whose objective value is larger than the optimum;
however, the solution violates some constraints. We then decompose this infeasible solution into a collection
of feasible solutions while retaining at least a constant fraction of the objective value.

For a $k$-CS PIP $\mathcal{P}$ let $\mathcal{L}(\mathcal{P})$ denote its linear relaxation $\{\max c \cdot x \mid Ax \le b, 0 \le x \le d\}$. We use the set
$I$ to index the constraints and $J$ to index the variables in our program. We note a simple assumption that
is without loss of generality for the purposes of obtaining an approximation algorithm: $A_{ij} \le b_i$ for all $i, j$.
To see this, note that if $A_{ij} > b_i$, then every feasible solution has $x_j = 0$ and we can simply delete $x_j$ from
the instance.

Now we give our iterated rounding method. Let the term entry mean a pair $(i, j) \in I \times J$ such that
$A_{ij} > 0$. Our iterated rounding algorithm computes a set $S$ of special entries; for such a set we let $A_{S \to 0}$
denote the matrix obtained from $A$ by zeroing out the special entries.

Lemma 12. Given a $k$-CS PIP $\mathcal{P}$, we can, in polynomial time, find $S$ and nonnegative integral vectors
$x^0, x^1$ with $x^0 + x^1 \le d$ and $x^1 \le \mathbf{1}$ such that

(a) $c \cdot (x^0 + x^1) \ge \mathrm{OPT}(\mathcal{L}(\mathcal{P}))$

(b) $\forall i \in I$, we have $|\{j : (i, j) \in S\}| \le k$

(c) $A x^0 + A_{S \to 0} x^1 \le b$.

In particular, since $x^1$ is 0-1, $(x^0 + x^1)$ is a solution such that for each row $i$, we have $a_i \cdot (x^0 + x^1) \le
b_i + k \max_j A_{ij}$. We now give the proof of the above lemma.

Proof of Lemma 12. First, we give a sketch. Recall that $A_j$ denotes the $j$th column of $A$ and $a_i$ denotes the
$i$th row of $A$. Let $\mathrm{supp}(A_j) := \{i \in I \mid A_{ij} > 0\}$, which has size at most $k$, and similarly $\mathrm{supp}(a_i) := \{j \in J \mid
A_{ij} > 0\}$. Let $x^*$ be an extreme optimal solution to $\mathcal{L}(\mathcal{P})$. The crux of our approach is as follows: if $x^*$ has
integral values we have made progress. If not, $x^*$ is a basic feasible solution so there is a set of $|\mathrm{supp}(x^*)| = |J|$
linearly independent tight constraints for $x^*$, so the total number of constraints $|I|$ satisfies $|I| \ge |J|$. By
double-counting there is some $i \in I$ with $|\mathrm{supp}(a_i)| \le k$, which is what permits iterated relaxation: we
discard the constraint for $i$ and go back to the start.

Figure 1 contains pseudocode for our iterated rounding algorithm, IteratedSolver.

IteratedSolver(A, b, c, d)

 1: Let $x^*$ be an extreme optimum of $\{\max c \cdot x \mid x \in \mathbb{R}^J; 0 \le x \le d; Ax \le b\}$
 2: Let $x^0 = \lfloor x^* \rfloor$, $x^1 = \mathbf{0}$, $J' = \{j \in J \mid x^*_j \notin \mathbb{Z}\}$, $I' = I$, $S = \emptyset$.
 3: loop
 4:    Let $x^*$ be an extreme optimum of $\{\max c \cdot x \mid x \in [0, 1]^{J'}; A x^0 + A_{S \to 0}(x + x^1) \le b\}$
 5:    For each $j \in J'$ with $x^*_j = 0$, delete $j$ from $J'$
 6:    For each $j \in J'$ with $x^*_j = 1$, set $x^1_j = 1$ and delete $j$ from $J'$
 7:    If $J' = \emptyset$, terminate and return $S, x^0, x^1$
 8:    for each $i \in I'$ with $|\mathrm{supp}(a_i) \cap J'| \le k$ do
 9:       Mark each entry $\{(i, j) \mid j \in \mathrm{supp}(a_i) \cap J'\}$ special, add it to $S$, and delete $i$ from $I'$
10:    end for
11: end loop

Figure 1: Algorithm for k-CS PIP.

Now we explain the pseudocode. The $x^0$ term can be thought of as a preprocessing step which effectively
reduces the general case to the special case that $d = \mathbf{1}$. The term $x^1 \in \{0, 1\}^J$ grows over time. The set $J'$
represents all $j$ that could be added to $x^1$ in the future, but have not been added yet. The set $I'$ keeps track
of constraints that have not been dropped from the linear program so far.

Since $x^*$ is a basic feasible solution we have $|I'| \ge |J'|$ in Step 8. Being $k$-CS, each set $\mathrm{supp}(A_j) \cap I'$ for
$j \in J'$ has size at most $k$. By double-counting, $\sum_{i \in I'} |\mathrm{supp}(a_i) \cap J'| \le k |J'| \le k |I'|$ and so some $i \in I'$ has
$|\mathrm{supp}(a_i) \cap J'| \le k$. Thus $|I'|$ decreases in each iteration, and the algorithm has polynomial running time.
(In fact, it is not hard to show that there are at most $O(k \log |I|)$ iterations.)

The algorithm has the property that $c \cdot (x^0 + x^1 + x^*)$ does not decrease from one iteration to the
next, which implies property (a). Properties (b) and (c) can be seen immediately from the definition of the
algorithm. □

Now we give the proof of the main result in this section. Here and later we abuse notation and identify
vectors in $\{0, 1\}^J$ with subsets of $J$, with 1 representing containment. That is, if we have two 0-1 vectors $y$
and $x$ we let $y \subseteq x$ denote the fact that $y_j = 1$ implies $x_j = 1$.

Theorem 2. There is a polynomial time $(2k^2 + 2)$-approximation algorithm for $k$-CS PIPs with multiplicity
constraints.

Proof. We use Lemma 12 to obtain $x^0$ and $x^1$. The main idea in the proof is to partition the set $x^1$ into
$2k^2 + 1$ sets which are all feasible (i.e., we get $x^1 = \sum_{j=1}^{2k^2+1} y^j$ for 0-1 vectors $y^j$ each with $A y^j \le b$). If we
can establish the existence of such a partition, then we are done as follows: the total profit of the $2k^2 + 2$
feasible solutions $x^0, y^1, \ldots, y^{2k^2+1}$ is $c \cdot (x^0 + x^1) \ge \mathrm{OPT}$, so the most profitable is a $(2k^2 + 2)$-approximately
optimal solution.

Call $j, j' \in x^1$ in conflict at $i$ if $A_{ij} > 0$, $A_{ij'} > 0$ and at least one of $(i, j)$ or $(i, j')$ is special. We claim
that if $y \subseteq x^1$ and no two elements of $y$ are in conflict, then $y$ is feasible; this follows from Lemma 12(c)
together with the fact that $A_{ij} \le b_i$ for all $i, j$. (Explicitly, for each constraint we either just load it with
a single special entry, or all non-special entries, both of which are feasible.) In the remainder of the proof,
we find a $(2k^2 + 1)$-colouring of the set $x^1$ such that similarly-coloured items are never in conflict; then the
colour classes give the needed sets $y^j$ and we are done.

To find our desired colouring, we create a conflict digraph which has node set $x^1$ and an arc (directed
edge) from $j$ to $j'$ whenever $j, j'$ are in conflict at $i$ and $(i, j)$ is special. Rewording, there is an arc $(j, j')$ iff
some $(i, j) \in S$ and $A_{ij'} > 0$. (If $(i, j')$ is also special, this also implies an arc $(j', j)$.) The key observation
is that each node $j' \in x^1$ has indegree bounded by $k^2$, i.e. there are at most $k^2$ choices of $j$ such that $(j, j')$
is an arc: to see this note $\#\{i \mid A_{ij'} > 0\} \le k$, and each $i$ in this set has $\#\{j \mid (i, j) \in S\} \le k$. Now we use
the following lemma, which completes the proof.

Lemma 13. A digraph with maximum indegree $d$ has a $(2d + 1)$-colouring.
Proof. We use induction on the number of nodes in the graph, with the base case being the empty graph.
Now suppose the graph is nonempty. The average indegree is at most $d$, and the average indegree equals the
average outdegree. Hence some node $n$ has outdegree at most the average, which is at most $d$. Together with its
indegree of at most $d$, this node has at most $2d$ neighbours. By induction there is a $(2d + 1)$-colouring when we
delete $n$; we can then extend it to the whole digraph by assigning $n$ any colour not used by its neighbours. □
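The induction in this proof translates into a peel-then-colour procedure. The following is a minimal sketch, assuming the digraph is given as a node list and a collection of arcs `(u, v)`; the function name `colour_digraph` is illustrative, self-loops are ignored, and the quadratic running time is chosen for clarity rather than efficiency.

```python
def colour_digraph(nodes, arcs):
    """Colour nodes with colours 0..2d so that no arc joins equal colours,
    where d is the maximum indegree."""
    nodes = list(nodes)
    arcs = {(u, v) for (u, v) in arcs if u != v}
    indeg = {v: 0 for v in nodes}
    for _, v in arcs:
        indeg[v] += 1
    d = max(indeg.values(), default=0)

    # Repeatedly remove a node of minimum outdegree in the remaining graph;
    # by the averaging argument that outdegree is at most d.
    remaining = set(nodes)
    order = []
    while remaining:
        out = {u: 0 for u in remaining}
        for u, v in arcs:
            if u in remaining and v in remaining:
                out[u] += 1
        n = min((u for u in nodes if u in remaining), key=lambda u: out[u])
        order.append(n)
        remaining.remove(n)

    # Colour in reverse removal order; each node then sees at most 2d
    # already-coloured neighbours, so one of the 2d + 1 colours is free.
    colour = {}
    for n in reversed(order):
        used = {colour[v] for (u, v) in arcs if u == n and v in colour}
        used |= {colour[u] for (u, v) in arcs if v == n and u in colour}
        colour[n] = next(c for c in range(2 * d + 1) if c not in used)
    return colour


# Five nodes on a circle, each with arcs to its two clockwise neighbours (d = 2).
nodes = range(5)
arcs = [(i, (i + s) % 5) for i in nodes for s in (1, 2)]
colouring = colour_digraph(nodes, arcs)
print(len(set(colouring.values())))  # prints "5"
```

In the example every pair of nodes is joined by an arc, so all $2d + 1 = 5$ colours are needed.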

(We remark that Lemma 13 is tight, e.g. arrange $2d + 1$ vertices on a circle and include an arc from each
vertex to its $d$ clockwise-next neighbours; this directed $K_{2d+1}$ cannot be $2d$-coloured.) This ends the proof
of Theorem 2. □

3.1 Improvements for k = 2 

We give some small improvements for the case $k = 2$, using some insights due to Shepherd & Vetta [33]. A 2-CS
PIP is non-simple if there exist distinct $j, j'$ with $\mathrm{supp}(A_j) = \mathrm{supp}(A_{j'})$ and $|\mathrm{supp}(A_j)| = 2$. Otherwise,
it is simple. Shepherd and Vetta consider the case when all non-zero entries of a column are equal. Under
this assumption, they get a 3.5 approximation for 2-CS PIPs, and a $\frac{11}{2} - \sqrt{5} \approx 3.26$ approximation for such
simple 2-CS PIPs, when $d = \mathbf{1}$. We extend their theorem as follows.

Theorem 14. There is a deterministic 4-approximation algorithm for 2-CS PIPs. There is also a randomized
$6 - \sqrt{5} \approx 3.764$-approximation algorithm for simple 2-CS PIPs with $d = \mathbf{1}$.

(Sketch). Since we are dealing with a 2-CS PIP, each $\mathrm{supp}(A_j)$ is an edge or a loop on vertex set $I$; we abuse
notation and directly associate $j$ with an edge/loop. Consider the initial value of $J'$, i.e. after executing Step
2. Then we claim that the graph $(I, J')$ has at most one cycle per connected component; to see this, note
that any connected component with two cycles would have more edges than vertices, which contradicts the
linear independence of the tight constraints for the initial basic solution $x^*$.

We modify IteratedSolver slightly. Immediately after Step 2, let $M \subseteq J'$ consist of one edge from
each cycle in $(I, J')$, and set $J' := J' \setminus M$. Then $M$ is a matching (hence a feasible 0-1 solution) and the new
$J'$ is acyclic. Modify the cardinality condition in Step 8 to $|\mathrm{supp}(a_i) \cap J'| \le 1$ (instead of $\le 2$); since $J'$ is
acyclic, it is not hard to show the algorithm will still terminate, and $\forall i \in I$, we have $|\{j : (i, j) \in S\}| \le 1$.

To get the first result, we use a colouring argument from [33, Thm. 4.1] which shows that $x^1$ can be 
decomposed into two feasible solutions $x^1 = y^1 + y^2$. We find that the most profitable of $x^0, M, y^1, y^2$ is a 
4-approximately optimal solution. 

For the second result, we instead apply a probabilistic technique from [33, §4.3]. They define a distribution 
over subsets of the forest $x^1$; let $z$ be the random variable indicating the subset. Let $p = \frac{1}{20}(5 + \sqrt{5})$. Say 
that an edge $ii'$ is compatible with $z$ if $z$ neither contains an edge with a special endpoint at $i$, nor at $i'$. 
The distribution has the properties that $z$ is always feasible for the PIP, $\Pr[j \in z] = p$ for all $j \in x^1$, and 
$\Pr[\mathrm{supp}(A_j) \text{ compatible with } z] \ge p$ for all $j \in x^0$. (Simplicity implies that $x^0$ and $x^1$ have no edge in 
common, except possibly loops, which is needed here.) Finally, let $w$ denote the subset of $x^0$ compatible 
with $z$. Then $z + w$ is a feasible solution, and $E[c(z + w)] \ge p\,c(x^1 + x^0)$. Hence the better solution of $z + w$ 
and $M$ is a $1 + 1/p = (6 - \sqrt{5})$-approximately optimal solution. □ 
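
The final step can be spelled out as a short calculation. It takes $p = \frac{1}{20}(5 + \sqrt{5})$, the value consistent with $1 + 1/p = 6 - \sqrt{5}$, and it assumes the bound $c(x^0 + x^1) + c(M) \ge \mathrm{OPT}$, which the sketch does not state explicitly but which the 4-approximation claim also appears to rely on.

```latex
\begin{align*}
\frac{1}{p} &= \frac{20}{5 + \sqrt{5}}
             = \frac{20\,(5 - \sqrt{5})}{25 - 5}
             = 5 - \sqrt{5}, \\
\mathrm{OPT} &\le c(x^0 + x^1) + c(M)
             \le \frac{1}{p}\, E[c(z + w)] + c(M)
             \le \Bigl(1 + \frac{1}{p}\Bigr) \max\bigl\{ E[c(z + w)],\; c(M) \bigr\}, \\
1 + \frac{1}{p} &= 6 - \sqrt{5} \approx 3.764 .
\end{align*}
```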

3.2 Improvements For High Width 

The width $W$ of an integer program is $\min_{i,j} b_i / A_{ij}$, taking the inner term to be $+\infty$ when $A_{ij} = 0$. Note 
that without loss of generality, $W \ge 1$. From now on let us normalize each constraint so that $b_i = 1$; then a 
program has width $\ge W$ iff every entry of $A$ is at most $1/W$. 
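
As a small illustration of the definition, the following Python sketch computes the width of a program given as a dense matrix and normalizes the constraints. The function names `width` and `normalise` and the list-of-lists representation are assumptions made for this example only.

```python
import math


def width(A, b):
    """Return min over i, j of b[i] / A[i][j], treating zero entries as +infinity."""
    return min(
        (b[i] / a for i, row in enumerate(A) for a in row if a > 0),
        default=math.inf,
    )


def normalise(A, b):
    """Scale each row so that its right-hand side becomes 1 (assumes every b[i] > 0)."""
    return [[a / bi for a in row] for row, bi in zip(A, b)]
```

For instance, with rows `[2, 0, 1]` and `[0, 4, 4]` and right-hand sides `[4, 8]`, the smallest ratio is 2, and after normalization the largest entry of the matrix is 1/2, matching the statement that every entry is at most $1/W$.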

In many settings better approximation can be obtained as $W$ increases. For example in $k$-RS CIPs with 
$b = \mathbf{1}$, the sum of each row of $A$ is at most $k/W$, so Propositions 6 and 7 give a $(1 + k/W)$-approximation 
algorithm. Srinivasan [35, 36] gave a $(1 + \ln(1 + k)/W)$-approximation algorithm for unbounded $k$-CS 
CIPs. Using grouping and scaling techniques introduced by Kolliopoulos and Stein [23], Chekuri et al. 
[7] showed that no-bottleneck demand multicommodity flow in a tree, and certain other problems, admit 
approximation ratio $1 + O(1/\sqrt{W})$. Multicommodity flow in a tree (without demands) admits approximation 
ratio $1 + O(1/W)$ [25]. Motivated by these results, we will prove the following theorem. 



Theorem 15. There is a polynomial time $\left(1 + \frac{2k}{W-k}\right)$-approximation algorithm to solve $k$-column-sparse PIPs 
with $W > k$. 



For $W \ge 2k$, Theorem 15 implies a $1 + O(k/W)$-approximation. For fixed $k > 4$ and large $W$ this is 
asymptotically tight since $1 + o(1/W)$-approximation is NP-hard, by results from [10, 25] on multicommodity 
flows in trees. After the initial publication of Theorem 15 [31], Bansal et al. [1] gave an algorithm with ratio 
16e • /fc 1 /^ , where e = 2.718.... 
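
For reference, the ratio in Theorem 15 and its asymptotic form follow from the guarantee $\frac{1 - k/W}{1 + k/W}\,\mathrm{OPT}$ established in the proof by elementary algebra; the constant 4 below is just one convenient choice.

```latex
\begin{align*}
\frac{1 + k/W}{1 - k/W} &= \frac{W + k}{W - k} = 1 + \frac{2k}{W - k}, \\
W \ge 2k \;\Longrightarrow\; W - k \ge \frac{W}{2}
  \;&\Longrightarrow\; 1 + \frac{2k}{W - k} \le 1 + \frac{4k}{W} = 1 + O(k/W).
\end{align*}
```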

Proof of Theorem 15. Run IteratedSolver. From Lemma 12 we see that $c \cdot (x^0 + x^1) \ge \mathrm{OPT}$ and, using 
the width bound, 

$$A(x^0 + x^1) \le (1 + k/W)\mathbf{1}. \tag{4}$$

Define $V(x)$ by $V(x) := \{i \in I \mid a_i \cdot x > 1\}$, i.e. the set of violated constraints in $Ax \le \mathbf{1}$. 

We want to reduce $(x^0 + x^1)$ so that no constraints are violated. In order to do this we employ a linear 
program. Let $\chi(\cdot)$ denote the characteristic vector. Our LP, which takes a parameter $\hat{x}$, is 

$$\mathcal{R}(\hat{x}) : \max\left\{ cx \;\middle|\; 0 \le x \le \hat{x},\ Ax \le \mathbf{1} - \frac{k}{W}\,\chi(V(\hat{x})) \right\}.$$
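
To make the LP's right-hand side concrete, here is a minimal Python sketch that computes the violated set and the vector $\mathbf{1} - \frac{k}{W}\chi(V(\hat{x}))$ for a normalized system. The helper names `violated` and `reducer_rhs` and the dense list representation are assumptions for illustration; solving the LP itself is left to whatever LP solver is available.

```python
def violated(A, x):
    """Indices i with a_i . x > 1, i.e. the set V(x) for the normalised system Ax <= 1."""
    return {
        i
        for i, row in enumerate(A)
        if sum(a * xj for a, xj in zip(row, x)) > 1
    }


def reducer_rhs(A, x_hat, k, W):
    """Right-hand side of the LP: 1 - k/W on rows in V(x_hat), and 1 elsewhere."""
    V = violated(A, x_hat)
    return [1 - k / W if i in V else 1.0 for i in range(len(A))]
```

Only the rows that are currently violated are tightened by $k/W$; this slack is what later absorbs the rounding up of at most $k$ fractional variables in such a row.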



We can utilize this LP in an iterated rounding approach, described by the following pseudocode. 



IteratedReducer 

    Let $\hat{x} := x^0 + x^1$ 
    while $V(\hat{x}) \ne \emptyset$ do 
        Let $x^*$ be an extreme optimum of $\mathcal{R}(\hat{x})$ 
        Let $\hat{x} := \lceil x^* \rceil$ 
    end while 



We claim that this algorithm terminates, and that the value of $c\hat{x}$ upon termination is at least 

$$\frac{1 - k/W}{1 + k/W}\, c \cdot (x^0 + x^1) \ge \frac{1 - k/W}{1 + k/W}\, \mathrm{OPT}.$$

Once we show these facts, we are done, since for the final $\hat{x}$, $V(\hat{x}) = \emptyset$ implies $\hat{x}$ is feasible. As an initial 
remark, note that each coordinate of $\hat{x}$ is monotonically nonincreasing, and so $V(\hat{x})$ is also monotonically 
nonincreasing. 

Observe that $\mathcal{R}$ in the first iteration has $\frac{1 - k/W}{1 + k/W}(x^0 + x^1)$ as a feasible solution, by Equation (4). Next, 
note that $x^*$, which is feasible for $\mathcal{R}$ in one iteration, is also feasible for $\mathcal{R}$ in the next iteration, since $x^* \le \lceil x^* \rceil$ and $V(\hat{x})$ is 
monotonically nonincreasing; hence the value of $c \cdot x^*$ does not decrease between iterations. 

To show the algorithm terminates, we will show that $V(\hat{x})$ becomes strictly smaller in each iteration. Note 
first that if $i \notin V(\hat{x})$, the constraint $a_i \cdot x \le 1$ is already implied by the constraint $x \le \hat{x}$. Hence $\mathcal{R}(\hat{x})$ may 
be viewed as having only $|V(\hat{x})|$ many constraints other than the box constraints $0 \le x \le \hat{x}$. Then $x^*$, a basic 
feasible solution to $\mathcal{R}(\hat{x})$, must have at most $|V(\hat{x})|$ non-integral variables. In particular, using the fact that 
the program is $k$-CS, by double counting, there exists some $i \in V(\hat{x})$ such that $\#\{j \mid x^*_j \notin \mathbb{Z},\ A_{ij} > 0\} \le k$. 
Thus (using the fact that all entries of $A$ are at most $1/W$) we have $a_i \cdot \lceil x^* \rceil < a_i \cdot x^* + k(1/W) \le 1$; so 
$i \notin V(\lceil x^* \rceil)$, and $V(\hat{x})$ is strictly smaller in the next iteration, as needed. □ 
4 Hardness of Column-Restricted 2-CS CIP 



Theorem 3. It is NP-hard to approximate 2-CS CIPs of the form $\{\min cx \mid Ax \ge b,\ x \text{ is 0-1}\}$ and $\{\min cx \mid 
Ax \ge b,\ x \ge 0,\ x \text{ integral}\}$ within ratio $17/16 - \epsilon$ even if the nonzeroes of every column of $A$ are equal and $A$ 
is of the block form $\begin{bmatrix} A_1 \\ A_2 \end{bmatrix}$ where each $A_i$ is 1-CS. 

Proof. Our proof is a modification of a hardness proof from [6] for a budgeted allocation problem. We focus 
on the version where $x$ is 0-1; the other version follows similarly with only minor modifications to the proof. 
The specific problem described in the statement of the theorem is easily seen equivalent to the following 
problem, which we call demand edge cover in bipartite multigraphs: given a bipartite multigraph $(V, E)$ 
where each vertex $v$ has a demand $b_v$ and each edge $e$ has a cost $c_e$ and value $d_e$, find a minimum-cost set $E'$ 
of edges so that for each vertex $v$ its demand is satisfied, meaning that $\sum_{e \in E' \cap \delta(v)} d_e \ge b_v$. Our construction 
also has the property that $c_e = d_e$ for each edge, so from now on we denote both by $d_e$. 

The proof uses a reduction from Max-3-Lin(2), which is the following optimization problem: given a 
collection $\{x_i\}_i$ of 0-1 variables and a family of three-variable modulo-2 equalities called clauses (for example, 
$x_1 + x_2 + x_3 = 1 \pmod 2$), find an assignment of values to the variables which satisfies the maximum number 
of clauses. Håstad [ ] showed that for any $\epsilon > 0$, it is NP-hard to distinguish between the two cases that (1) 
a $(1 - \epsilon)$ fraction of clauses can be satisfied and (2) at most a $(1/2 + \epsilon)$ fraction of clauses can be satisfied. 

Given an instance of Max-3-Lin(2) we construct an instance of demand edge cover as follows. For each 
variable $x_i$ there are three vertices "$x_i$", "$x_i = 0$" and "$x_i = 1$"; these vertices have $b$-value $4\deg(x_i)$ where 
$\deg(x_i)$ denotes the number of clauses containing $x_i$. For each clause there are four vertices labelled by the 
four assignments to its variables that do not satisfy it; for example for the clause $x_1 + x_2 + x_3 = 1 \pmod 2$ 
we would introduce four vertices, one of which would be named "$x_1 = 0, x_2 = 0, x_3 = 0$." These vertices 
have $b$-value equal to 3. Each vertex "$x_i = C$" is connected to "$x_i$" by an edge with $d$-value $4\deg(x_i)$; each 
vertex $v$ of the form "$x_{i_1} = C_1, x_{i_2} = C_2, x_{i_3} = C_3$" is incident to a total of nine edges each with $d$-value 1: 
three of these edges go to "$x_{i_j} = C_j$" for each $j = 1, 2, 3$. The construction is illustrated in Figure 2. 

Let $m$ denote the total number of clauses; so $\sum_i \deg(x_i) = 3m$. We claim that the optimal solution to 
this demand edge cover instance has cost $24m + 3t$ where $t$ is the least possible number of unsatisfied clauses 
for the underlying Max-3-Lin(2) instance. If we can show this then we are done since Håstad's result shows 
we cannot distinguish whether the optimal cost is $\ge 24m + 3m(1/2 - \epsilon)$ or $\le 24m + 3(\epsilon m)$; this gives an 
inapproximability ratio of $\frac{24 + 3(1/2 - \epsilon)}{24 + 3\epsilon} = 17/16 - \epsilon'$ for some $\epsilon' > 0$ such that $\epsilon' \to 0$ as $\epsilon \to 0$, which will 
complete the proof. 
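
The arithmetic behind the ratio can be checked directly; the closed form for $\epsilon'$ below is derived here for illustration and is not stated in the original argument.

```latex
\begin{align*}
\frac{24m + 3m(1/2 - \epsilon)}{24m + 3\epsilon m}
  &= \frac{51/2 - 3\epsilon}{24 + 3\epsilon}, \\
\frac{17}{16} - \frac{51/2 - 3\epsilon}{24 + 3\epsilon}
  &= \frac{17(24 + 3\epsilon) - 16(51/2 - 3\epsilon)}{16(24 + 3\epsilon)}
   = \frac{99\,\epsilon}{16(24 + 3\epsilon)}
   = \frac{33\,\epsilon}{16(8 + \epsilon)} =: \epsilon' .
\end{align*}
```

So $\epsilon' > 0$ whenever $\epsilon > 0$, and $\epsilon' \to 0$ as $\epsilon \to 0$; at $\epsilon = 0$ the ratio is exactly $\frac{51/2}{24} = \frac{17}{16}$.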

Let $x^*$ denote a solution to the Max-3-Lin(2) instance with $t$ unsatisfied clauses; we show how to obtain 
a demand edge cover $E'$ of cost $24m + 3t$. We include in $E'$ the edge between "$x_i$" and "$x_i = x_i^*$" for each 
$i$; this has total cost $\sum_i 4\deg(x_i) = 12m$. For each satisfied clause $x_i + x_j + x_k = C \pmod 2$, we include in 
$E'$ all three edges between "$x_i = 1 - x_i^*$" and "$x_i = 1 - x_i^*, x_j = x_j^*, x_k = x_k^*$" and similarly for $j, k$, and one 
of each of the parallel triples incident to "$x_i = 1 - x_i^*, x_j = 1 - x_j^*, x_k = 1 - x_k^*$"; this has cost 12 for that 
clause. For each unsatisfied clause $x_i + x_j + x_k = C \pmod 2$, we include in $E'$ any three unit-cost edges 
incident to "$x_i = x_i^*, x_j = x_j^*, x_k = x_k^*$," as well as twelve more unit-cost edges: namely in the six nodes 
consisting of "$x_i = 1 - x_i^*$," "$x_i = 1 - x_i^*, x_j = 1 - x_j^*, x_k = x_k^*$" and their images under swapping $i$ with $j$ 
and $k$, the induced subgraph is a 6-cycle of parallel triples, and we take two edges out of each triple. Thus 
the chosen edges have total cost 15 for that clause. It is not hard to see that this solution is feasible: e.g. 
vertices of the form "$x_i = 1 - x_i^*$" are covered by 4 edges for each clause containing them. The total cost is 
$c(E') = 12m + 12(m - t) + 15t = 24m + 3t$. 
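
The construction of $E'$ from an assignment can be checked mechanically. The Python sketch below is an illustration under assumptions made here: the vertex encodings `("var", i)`, `("lit", i, value)` and `("clause", c, values)`, the function names, and the requirement that each clause has three distinct variables are all choices for this example. It builds only the chosen edges (not the full multigraph), then verifies feasibility and the cost $24m + 3t$.

```python
from collections import Counter
from itertools import product


def build_cover(num_vars, clauses, assignment):
    """Return (demand, chosen) for the edge set E' built from an assignment x*.

    clauses: list of ((i, j, k), rhs) meaning x_i + x_j + x_k = rhs (mod 2),
    with i, j, k distinct. chosen holds one (left, right, d_value) per edge.
    """
    deg = Counter(v for vars_, _ in clauses for v in vars_)
    demand, chosen = {}, []
    for i in range(num_vars):
        w = 4 * deg[i]
        demand[("var", i)] = demand[("lit", i, 0)] = demand[("lit", i, 1)] = w
        # Thick edge between "x_i" and "x_i = x_i^*".
        chosen.append((("lit", i, assignment[i]), ("var", i), w))
    for c, (vars_, rhs) in enumerate(clauses):
        star = tuple(assignment[v] for v in vars_)
        for vals in product((0, 1), repeat=3):
            if sum(vals) % 2 == rhs:
                continue  # clause vertices exist only for non-satisfying assignments
            node = ("clause", c, vals)
            demand[node] = 3
            flipped = [p for p in range(3) if vals[p] != star[p]]
            if len(flipped) == 0:      # unsatisfied clause: the vertex of x* itself
                take = {0: 3}          # any three incident unit edges will do
            elif len(flipped) == 1:    # satisfied clause: one variable flipped
                take = {flipped[0]: 3}
            elif len(flipped) == 2:    # unsatisfied clause: vertex on the 6-cycle
                take = {p: 2 for p in flipped}
            else:                      # satisfied clause: all three flipped
                take = {p: 1 for p in flipped}
            for p, count in take.items():
                lit = ("lit", vars_[p], vals[p])
                chosen.extend([(lit, node, 1)] * count)
    return demand, chosen


def check(demand, chosen):
    """Return (feasible, cost); each edge's cost equals its d-value."""
    covered = Counter()
    for left, right, d in chosen:
        covered[left] += d
        covered[right] += d
    feasible = all(covered[v] >= b for v, b in demand.items())
    return feasible, sum(d for _, _, d in chosen)


def unsatisfied(clauses, assignment):
    """Number of clauses violated by the assignment."""
    return sum(1 for vars_, rhs in clauses
               if sum(assignment[v] for v in vars_) % 2 != rhs)
```

Running `check(*build_cover(...))` over all assignments of a small instance is a quick sanity test that the per-clause costs of 12 and 15 and the coverage of the "$x_i = 1 - x_i^*$" vertices work out as described.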

To finish the proof we show the following. 

Claim 16. Given a feasible demand edge cover $E'$, we can find a solution $x^*$ such that $t$, the number of 
unsatisfied clauses for $x^*$, satisfies $24m + 3t \le c(E')$. 

Proof. First we claim it is without loss of generality that for each $i$, $E'$ contains exactly one of the edges 
incident to "$x_i$". Clearly at least one of these two edges lies in $E'$; if both do, then remove one (say, the edge 
between "$x_i$" and "$x_i = 0$") and add to $E'$ any subset of the other $6\deg(x_i)$ edges incident to "$x_i = 0$" so 
that the total number of edges incident on "$x_i = 0$" in $E'$ becomes at least $4\deg(x_i)$. The removed edge has 
$d$-value $4\deg(x_i)$ and all other incident edges have $d$-value 1, so clearly the solution is still feasible and the 
cost has not increased. 

Figure 2: Left: the gadget constructed for each variable $x_i$. The vertices shown as rectangles have $b$-value 
$4\deg(x_i)$; the thick edges have $d$-value and cost $4\deg(x_i)$. Right: the gadget constructed for the clause 
$x_i + x_j + x_k = $ (mod 2). The vertices shown as rounded boxes have $b$-value 3; the thin edges each have 
unit $d$-value and cost. 

Define $x^*$ so that for each $i$, $E'$ contains the edge between "$x_i$" and "$x_i = x_i^*$." Let $E''$ denote the edges 
of $E'$ incident on clause vertices (i.e. the edges of $E'$ with unit $d$-value). For $F \subseteq E''$ their left-contribution, 
denoted $\ell(F)$, is the number of them incident on vertices of the form "$x_i = 1 - x_i^*$." Note that $\ell(F) \le |F|$ for 
any $F$. Furthermore for each unsatisfied clause, all edges incident on its vertex "$x_i = x_i^*, x_j = x_j^*, x_k = x_k^*$" 
have zero left-contribution, but $E'$ contains at least three of these edges. Thus the edges $F$ of $E''$ incident on 
that clause's vertices have $\ell(F) \le |F| - 3$. Finally, consider $\ell(E'')$. Each edge of $E''$ is in the gadget for a 
particular clause, and it follows that $\ell(E'') \le |E''| - 3t$ where $t$ is the number of unsatisfied clauses for $x^*$. 
However, $E''$ needs to have $4\deg(x_i)$ edges incident on each "$x_i = 1 - x_i^*$" so $\ell(E'') \ge \sum_i 4\deg(x_i) = 12m$. 
Thus $|E''| \ge 12m + 3t$ and considering the edges incident on the vertices "$x_i$" we see that $c(E') \ge 24m + 3t$. □ 

This completes the proof of the reduction. □ 



5 Open Problems 

It is natural to conjecture that $k$-CS CIP with a submodular objective admits an approximation ratio 
depending only on $k$, perhaps $O(\ln k)$ matching the best ratio known for linear objectives. 
Although 2-RS IPs are very hard to optimize (at least as hard as Max Independent Set), the problem of 
finding a feasible solution to a 2-RS IP is still interesting. Hochbaum et al. [17] gave a pseudopolynomial- 
time 2-SAT-based feasibility algorithm for 2-RS IPs with finite upper and lower bounds on variables. They 
asked if there is a pseudopolynomial-time feasibility algorithm when the bounds are replaced by just the 
requirement of nonnegativity, which is still open as far as we know. It is strongly NP-hard to determine if 
IPs of the form $\{x \ge 0 \mid Ax = b\}$ are feasible when $A$ is 2-CS [16], e.g. by a reduction from 3-Partition; but 
for IPs where each variable appears at most twice including in upper/lower bounds, it appears all that is 
known is weak NP-hardness (for example, via the unbounded knapsack problem [30]). 